Thesaurus Construction Project for Persian Manuscripts
Abstract
Purpose: Manuscripts, as the written works of past generations, are important collections in research and university libraries in Iran. They convey useful information about many subject areas. The need for access to this enormous amount of information calls for organizing the manuscripts and applying new electronic information technologies. Methodology: Because the field of manuscripts has more than 5,000 specialized terms, it is essential to control and organize these terms systematically according to their subject scope. Since a thesaurus is one of the tools for controlling the vocabulary of a subject, preparing a comprehensive electronic thesaurus in addition to a published version is recommended. Accordingly, a plan for the thesaurus, its structure, and the scope of codicology terms is surveyed. Findings: The result of the survey is, for the first time in Iran, a proposed structure for a thesaurus of codicology based on available samples, together with a brief example of the thesaurus.
Journal:
Academic Librarianship and Information Research (تحقیقات کتابداری و اطلاع رسانی دانشگاهی), Volume 40, Issue 46, Pages 0-0